\documentclass{acm_proc_article-sp}
%\documentclass[times, 10pt,twocolumns]{article}
\usepackage{times}
%\usepackage[english]{algorithm2e}
\usepackage{algorithm}
\usepackage{algpseudocode}
%\usepackage[named]{algo}
%\algref{<algorithm>}{<line>}
\newtheorem{theorem}{Theorem}
%\newcounter{Observation}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{Observation}[theorem]{Observation}
\def\candidate{{\cal C}}
\def\comment#1{}
\usepackage{graphicx}
\input{psfig}


\pagestyle{empty}

\begin{document}

\title{Finding Semantics in Time Series}

%\numberofauthors{1}
% \author{
% \alignauthor Peng Wang $^\dag$\hspace{1cm}Haixun Wang $^\ddag$\hspace{1cm}Wei Wang $^\dag$\\
% \ \\
%        \affaddr{\hspace{1.35cm}$^\dag$Fudan University\hspace{4.3cm}$^\ddag$IBM T. J. Watson Research Center\hspace{.5cm}}\\
%        \affaddr{\hspace{1cm}Shanghai, China\hspace{5cm}Hawthorne, NY 10533, USA}\\
%        \affaddr{\{pengwang5,weiwang1\}@fudan.edu.cn\hspace{2cm} \mbox{\hspace{1cm}}haixun@us.ibm.com\mbox{\hspace{1cm}}}
% }


\maketitle \thispagestyle{empty} \begin{abstract}

  In order to understand the internal dynamics of a complex system, we
  often start with the analysis of its output or its log. We track a
  system's resource consumption (CPU, memory, message queues of
  different types, etc.) to help avert system failures; we examine
  economic indicators to assess the severity of a recession; we
  monitor a patient's heart rate or EEG for disease diagnosis. In many
  such applications, time series data is involved. Much work has been
  devoted to pattern discovery from time series data, but not much has
  attempted to use the time series to unveil a system's internal
  dynamics.  In this paper, we go beyond learning patterns from time
  series data. We focus on obtaining a better understanding of its
  data generating mechanism, and we regard patterns and their temporal
  relations as organic components of the hidden
  mechanism. Specifically, we propose to model time series data using
  a novel pattern-based hidden Markov model (pHMM), which aims at
  revealing a global picture of the system that generates the time
  series data. We propose an iterative approach to refine pHMMs
  learned from the data. In each iteration, we use the current pHMM to
  guide time series segmentation and clustering, which enables us to
  learn a more accurate pHMM.  Furthermore, we propose three pruning
  strategies to speed up the refinement process. Empirical results on
  real datasets demonstrate the feasibility and effectiveness of the
  proposed approach.
\end{abstract}

\input{introduction}

\input{preliminary}
\input{initial}
\input{refine}

\input{application}
\input{Experiment}
\input{related}

%\comment{
\section{Conclusion}
\label{sec:conclusion}


In this paper, we propose a pattern-based hidden Markov model (pHMM)
for time series data to learn the dynamics of the system that
generates the time series. The biggest difference between the pHMM and
the traditional HMM is that the pHMM uses learned patterns as
observations. We propose a method that simultaneously learns the
patterns and the HMM built on them. Furthermore, we propose three
pruning strategies to speed up the learning process. With the pHMM, we
can perform pattern-based tasks such as trend prediction and general
correlation detection. Empirical results on real datasets demonstrate
the feasibility and effectiveness of the proposed approach. In future
work, we plan to extend the pHMM to multidimensional datasets and
streaming applications.

{\renewcommand{\baselinestretch}{0.92}
\normalsize
\bibliographystyle{plain}
\bibliography{haixun}
}
\end{document}
